Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(k8s): correctly resolve manifests when build is set #4516

Merged
merged 6 commits into from
Jul 7, 2023

Conversation

edvald
Copy link
Collaborator

@edvald edvald commented Jun 2, 2023

This ended up being a bit of a refactor, since we already had a TODO in code around resolving manifests when using build dependencies, and that needed to be resolved to properly address the issue.

kubernetes module/Deploy statuses now rely on a metadata ConfigMap, in principle akin to how Helm chart statuses are resolved. This avoids having to actually perform builds before working out the manifests that belong to a Deploy action.

Along the way, avoided superfluous API calls when comparing resources and resolving their statuses.

Closes #4505
Closes #4506

cc @stefreak

This ended up being a bit of a refactor, since we already had a TODO in code
around resolving manifests when using build dependencies, and that needed to
be resolved to properly address the issue.

`kubernetes` module/Deploy statuses now rely on a metadata ConfigMap, in
principle akin to how Helm chart statuses are resolved. This avoids having
to actually perform builds before working out the manifests that belong to
a Deploy action.

Along the way, avoided superfluous API calls when comparing resources and
resolving their statuses.
@edvald edvald requested a review from stefreak June 2, 2023 09:30
@edvald
Copy link
Collaborator Author

edvald commented Jun 2, 2023

Added a quick feature commit because I ran into it while working on this (no doubt quite welcome to those who have bumped into not being able to use globs in the files field for kubernetes modules/actions).

Copy link
Member

@stefreak stefreak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extremely good work @edvald!

I like how this has the side effect of reducing the number of API calls.

This is a big change and we need to carefully see if we can live with interactions of this change together with the Automatic Environment Cleanup feature of Garden Cloud and similar work arounds that are being used in the wild (e.g. #3438)

core/src/plugins/kubernetes/kubernetes-type/common.ts Outdated Show resolved Hide resolved
core/src/plugins/kubernetes/kubernetes-type/common.ts Outdated Show resolved Hide resolved
const api = await KubeApi.factory(log, ctx, k8sCtx.provider)

// FIXME: We're currently reading the manifests from the module source dir (instead of build dir)
// because the build may not have been staged.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? Why has the build not been staged? 🤔
Trying to wrap my head around this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, to figure out wether or not the kubernetes deploy action is cached or not, and thus to figure out wether or not the build exec action needs to run, we need the manifest currently– now I start to understand.

We need to make sure that this new logic is being reflected in the "pause" logic of the cleanup runner in Garden Cloud (https://github.com/garden-io/garden-platform/blob/main/workflow-runner/src/runners/cleanup-runner/index.ts#L258) – if I understand this right, in the pause logic of the cleanup runner we now need to delete the config map, instead of overwriting the label?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good in the mid to long term if we standardise how the caching works across plugins, e.g. by using Garden Cloud as a backend and falling back to a local file backend if not connected to Cloud.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would be good to have helpers for, maybe something built-in on ctx, but figure that's out of scope here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: pausing environments, we are of course planning to make that more native+explicit (incl. a garden pause command) but I'd need to defer on how this might affect the current routine used in Cloud/Enterprise. @10ko do you see a problem upfront here?

@stefreak
Copy link
Member

stefreak commented Jun 2, 2023

@edvald can you transfer some knowledge on Monday and Tuesday how this is supposed to work and how to classify #4505, #4506 and #4513 in light of this?

@edvald
Copy link
Collaborator Author

edvald commented Jun 2, 2023

@stefreak this closes #4506 but I think the other two issues still need attention.

Orzelius
Orzelius previously approved these changes Jun 7, 2023
Copy link
Contributor

@Orzelius Orzelius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as well. I gather this is good to merge?

@stefreak
Copy link
Member

stefreak commented Jun 7, 2023

Looks good to me as well. I gather this is good to merge?

Let's merge this soon, but it would be optimal if made a change in the Garden Cloud cleanup runner at least before releasing this, otherwise the pause feature will cease to be compatible with latest Garden. I will try to do that this week as I think it's not a big amount of work.

@eysi09
Copy link
Collaborator

eysi09 commented Jun 14, 2023

What's the status on this one? Can we merge? (cc @stefreak)

@stefreak
Copy link
Member

@eysi09 @edvald @Orzelius I did not yet have the chance to look into this but I think it's ok to merge and then make changes in AEC if even necessary.

@vvagaytsev
Copy link
Collaborator

@edvald there are some merge conflicts. Please resolve those and let's merge this :)

@vvagaytsev
Copy link
Collaborator

@edvald let's rebase this and merge

# Conflicts:
#	core/src/plugins/exec/exec.ts
#	core/src/plugins/kubernetes/container/status.ts
#	core/src/plugins/kubernetes/kubernetes-type/common.ts
#	core/src/plugins/kubernetes/kubernetes-type/handlers.ts
#	core/src/plugins/kubernetes/status/status.ts
#	core/test/integ/src/plugins/kubernetes/kubernetes-type/handlers.ts
#	yarn.lock
# Conflicts:
#	core/src/plugins/openshift/deploy.ts
Walther
Walther previously approved these changes Jul 7, 2023
Copy link
Contributor

@Walther Walther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good as far as I can tell! ✨
Approving but will let others review it too in case I missed something.

@vvagaytsev
Copy link
Collaborator

Hmm... Now there are some integration test failures in all VMs. I'll take a look.

@vvagaytsev
Copy link
Collaborator

Fixed the test failures in fecb878.

Copy link
Contributor

@Walther Walther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @vvagaytsev bringing this over the finish line! Thank you so much 🚀

@stefreak
Copy link
Member

@vvagaytsev @edvald @Walther @eysi09 did someone verify the AEC pause feature? Reading this comment sounds like there is definitely a change needed in Garden Cloud AEC

stefreak added a commit that referenced this pull request Jul 12, 2023
This reverts commit 79d6a13, reversing
changes made to 29621b9.
stefreak added a commit that referenced this pull request Jul 12, 2023
This reverts commit 79d6a13, reversing
changes made to 29621b9.
vvagaytsev pushed a commit that referenced this pull request Jul 13, 2023
vvagaytsev added a commit that referenced this pull request Jul 31, 2023
The regression was introduced in #4516.
Initial implementation of glob patterns in files paths
could result into empty list of files.
This fixes the behaviour by fail-fast existence checks for manifest files.
vvagaytsev added a commit that referenced this pull request Aug 1, 2023
* fix(k8s): throw on missing k8s manifests files

The regression was introduced in #4516.
Initial implementation of glob patterns in files paths
could result into empty list of files.
This fixes the behaviour by fail-fast existence checks for manifest files.

* test: add missing tests for `readManifest`

Test coverage for `spec.files` of `Deploy` action of `kubernetes` type.

* Add own Deployment manifest file, because builds are not executed in the test.
* Add unit tests to document and cover the expected behaviour.

Initial changes from c9efb47 did not have effect.
The Deploy action `with-build-action` has never been called with `readManifests`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants